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Donald P. Gaver 



1 . Introduction 

Consider a collection of J entities called "units" that inde- 
pendently generate events in accordance with Poisson processes, 
each with rate parameter A ^ . We are in possession of observations 

on each of these processes, and have seen Sj events for a time 

t h 

exposure of t. for the j — , j = 1,...,J. Possibly also available 
are concommitant observations on other variables x that may in 
part influence (explain) the values of the rates A ^ . The problem 
is to use these observations to describe the nature and extent of 
the variation between individual unit rates, and on this basis to 
predict (a) the future event generation behavior of individual units 
under observation, as well as (b) the overall rate variability of 
existing units, and hence the likely rate behavior of other, similar, 
units not yet under observation. The object of this paper is to 
propose and examine statistical methods for approaching the above 
problems. The approach emphasized is to treat the unknown rates 
as being describable in part as coming from a fixed population of 
possible rates, and then to describe or assess that population and 
its implications for estimating the individualized unit rates. The 
approach is called hierarchical because each rate may be viewed 
as a realization of some random variable associated with a higher- 
level super population of rates; such models are also called random 
parameter or parametric empirical Bayes models; see Morris (1983) 
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for a review. There have been a variety of applications of similar 
models in many fields. However, particular emphasis is given in 
this paper to analyses that invoke discrepancy tolerant superpopu- 
lation parametric representations: ones yielding estimating pro- 

cedures that may assist in identification of distinct rate groupings, 
existence of apparently discrepant or outlying rates, etc., a 
better understanding of which could suggest desirable improvements 
for systems so identified. Of course this latter step may well 
lead in practice to a change in the superpopulation, and to need 
for an updated new analysis. The steps suggested resemble the cycle 
of data analysis and modelling, model diagnosis by residual and 
sensitivity analysis, and repeat, often adopted in enlightened 
regression analyses; cf. Mosteller and Tukey (1977), Belsley, Kuh, 
and Welsch (1980), and elsewhere. Some ideas of discrepancy-tolerant 
or robust Bayesian analyses have been described by Berger (1980), (1984), 

who references Albert (1979) for as-yet unpublished studies. Ideas 
expressed in the paper of Box (1980) , with discussion, are quite 
relevant, as well. 

This paper proceeds by first introducing hierarchical Poisson 
models. Specification of useful parametric forms for the superpopu- 
lation that describes between-unit variability is the next topic; 
this is followed by a discussion of explicit adjusted estimates for 
individual event rates in terms of superpopulation parameters. 

Finally, some procedures are described for obtaining estimates of 
the superpopulation parameters. The estimation procedure effec- 
tiveness is assessed by simulation, and the technology is applied 
to certain sets of real data. 
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2 . The Hierarchical Poisson Model 

Introduce as a starting point the Poisson process of events 

for item j with (conditional) event rate Aj(Xj,Ej) where 

x. = (x, . , x_ .,..., x . ) is a vector of explanatory (regression) 

D J ^ J P J 

variables, and is (the realization of) a random variable with 

fixed density f (x;£) ; the latter describes an infinite super- 

th 

population with the parameters vector 0 . The value of the j — 

(1 £ j £ J) latent variable is here taken to be fixed for all 

time, once drawn from the superpopulation. It is thus a random 

individualization of the failure rate of item j; while x. values 
~3 

account for differences in environmental factors. Generally, the 
superpopulation distribution accounts for the rate variations be- 
tween items or individuals after adjustment for environmental effects 
explained by x ^ . Of course the manner in which the explanatory 
variables are used can influence the form of the apparent super- 
population distribution, and the selection of items for study, if 
influenced by the rate values, can bias the estimation of super- 
population parameters; see Lehoczky (1984). 

Contrast the above model types to those in which A ^ is a random 
function of time: e.g. in discrete time, monthly or weekly perhaps, 

the rate changes in accordance with A., = A.(x.,e..), and 

It 3 ~3 3 t 

{e. t , t = 1,2,...} is a collection of possibly iid random variables, 
or perhaps a more general stochastic process changing in discrete 
or continuous time; these last are called "random environment" 
models, or more specifically doubly stochastic Poisson models, see 
Cox and Lewis (1966) , Cox and Isham (1980) , Gaver (1963) , Reynolds 
and Savage (1971), and Burridge (1981). For instance, if the 
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t 

integrated hazard 

0 

gamma stochastic process, then the original Poisson process becomes 
a negative binomial process. Other interesting models would allow 
for changes in the superpopulation as a result of event observation 
and remedial action. Consideration of all of these latter is 
beyond the scope of this study; this paper confines its attention 
to the simplest random individualization hierarchical structures. 



A ! dt ' = 
Dt 



A j (t) 



is the realization of a 
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3 . Two Classical Hierarchical Models: The Log-Normal and Gamma 

Super -populations 

In order to parameterize a superpopulation of even rates in a 
simple fashion, both the log-normal and the gamma distribution 
have been utilized historically. Here is a brief discussion, with 
modifications reserved for the following section. 

3.1 Log-Normal Superpopulation: The L/N/P (Log-Normal Poisson) Model 

Suppose that the Poisson rate of item j is of the form 



A j = expUj) j = 1,2,. . . ,J 



(3.1) 



where ~ N(p^,o ) . Let p^ = p + x^8_, x^ being a vector of covari- 
ates and £ a vector of regression coefficients, whenever the regression 
term is present; otherwise p^ = p. This paper does not consider the 
fitting of regression coefficients. Each of the J items is exposed 
for known time t ^ , with s^ (= 0,1,2,...) observed events recorded therein. 

This (log-normal) model is expecially popular in the Proba- 
bilistic Risk Assessment (PRA) of nuclear reactor safety and 
operating systems, see the Reactor Safety Study (WASH 1400) (1975), 

and subsequent numerous reports on this topic; in particular 
note Kaplan (1983) . Items may be in-plant equipments such as 
continuously acting pumps, valves, and control devices that are 
subject to failure events; other events of concern are so-called 
initiating events such as loss of feedwater, pipe breaks, loss of 
offsite power and other challenges to the integrity of a nuclear-- 
or other--plant ' s safe and productive operation. The failure or 
initiating event occurrences may initially be taken to be time- 
homogeneous Poisson stochastic processes, with rates that vary 
between design copies in accordance with environmental influences 
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(manufacturer, geographical location, plant type, (sub) system type, 
etc.). A full analysis endeavors to estimate the influence of 
the explanatory environmental variables as represented by a re- 
gression function, and the properties of the random individualiza- 
tions, e j , so as to provide (i) good estimates of the superpopulation 
center (mean) and variability (variance) and (ii) good estimates 
of the individual item failure rates. An individual rate estimate 
constructed from its own experience alone may be subject to con- 
siderable random error; pooling its data with that of other similar 
items can reduce that random error at the possible expense of 
adding bias. Our methods suggest pooling that is reasonably based 
on superpopulation characteristics as estimated under (i), assuming 
that the superpopulation is (log) normal; in a later section esti- 
mates are developed that depend less on such a special form, i.e. 
that pool more selectively. 

Given the model (3.1) a revised, pooled or shrunken, estimate 
of the rate associated with item j can be obtained by Bayes' formula, 
perhaps in the form 



A. = E [ A • I s . , x . 
3 ~3 3 ~3 



1 y_M i 2 

2'o -A (y ) t . s. 

= K. / A(y)e e 3 ( A (y) ) 3 dy , 

3 _oo 

(3.2) 



where A(y) = exp(y) , y = P + x ^ , and is a normalizing con- 
stant. The integral sometimes may be well approximated by use of 
Laplace's method, see Tierney and Kadane (1984). However, a 
likelihood approach provides quick and interpretable results: 

A A 

choose e . = £n A . to maximize 
3 3 
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L j ( e j ; m / o 



1 2 

-_(_2 1 ) s -j 

2 o -A (e . ) t . ( A (e . ) t . ) 3 

e 3 3 3 3 



/2tto 



s . ! 
3 



( 3 . 3 ) 



the likelihood of e^ given p,o and the observations. Differen- 
tiate the log of (3.3) and equate to zero to obtain the non-linear 
estimating equation 



£ . 

. 3 _ 



( s . + — — 

nr 



2 » £ 
3 



(3.4) 



The nature of the estimate is appreciated if we let £^(1) = ln(Sj/tj) 
be an initial solution (putting s jAj = l/(3t^) if s^ = 0) , and pass 
through one Newton-Raphson iteration to obtain 



£ . 
3 



s . In (Sj/t j ) + p ./o' 
s . + (l/o 2 ) 



(3.5) 



Since a delta-method approximation to Var [ In ( s . /t . ) |A. = A.] 
is just 1/Ajtj = 1/s, the estimate (3.5) is the linearly weighted 
shrinkage of the log-rate estimate towards the assumed super- 
population mean, p, with weights the reciprocals of the within 
and between (superpopulation) variances, familiar in the normal/ 
Gaussian distribution Bayes ian-conj ugate prior framework. 

The above estimates require values for superpopulation param- 

2 



eters u. = u + x.B and o 
3 -3“ 



Once these are at hand, values can be 



computed for e ^ . Desirably, ).u and o can be estimated from data 
on all J items' histories. Various options exist for this; some 
are proposed and explored in a later section. 
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3.2 Gamma Superpopulation 

Consider the convenient (conjugate) alternative model 



A . = U . exp ( Bx . ) 

D 3 * — D 



where now all is as before except that U. 

3 

that the density is, for u > 0, 



Gamma(6,a), meaning 



f ( u ; 6 , a ) 




, . . a- 

(6u) 

T(a) 




(3.6) 



This model has a long history, for it leads to the negative 
binomial marginal distribution of event counts: 



P{ s = k I x} 



E[e- A( y )t 



[A (y) t] lc 1 
k! J 



/ e ~A (u) t 

0 



[A (u) t] k 
k! 



f (u; 6 ,a) du 



(3.7) 



r (k+a) . 6 v 

k ! T (a) v 6 + t exp ( 8_ x) 



t exp (B_ x) k 
^ 6 + t exp ( B^ x) ^ 



and to a simple linear formula for the Bayes estimate of the 
individualized rate 



Aj E [ A j l s j,tjXj] 



s j exp ( Bx ^ ) + a exp (B_ Xj ) 
t j exp (B_ Xj ) +6 



(3.8) 



Since for the gamma. 



rriy = E[U] 



a 

6 



U 



Var(U] = —Tr- 



ot 
2 ' 



(3.9) 
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then 



m 



6 = 



U 

Var [U] ; 



m, 



a = 



U 

Var [ U] ' 



the above expression can be expressed as 



exp (f? x . ) t . 
( 3 1 



m, 



A . 
3 



U 



’■•t 2 ’ + vadUl <m U exp ^ *j > > 

] i_ 

exp(f? x . ) t . 1 

2 1 + 



(3.10) 



m, 



U 



Var [U] 



which is again a linear shrinking of the raw point estimate (s^/t^) 
towards the appropriate superpopulation mean, here expressed as 
m^ exp (J^x^) ; the weights are again recognizable as within 
(m^/exp ( £5 Xj ) t ^ ) and between (Var [ U] ) variance components. 

Note that the random log-linear model (3.1) is only one 
suggestive model form. Conceivably random individualization should 
be of multiplicative form: In A ^ 6_Xj , rather than additive, 

so that covariate influence varies from item to item. Other possi- 
bilities also exist, but regression effects are not considered 
further in this paper. 
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4 . Discrepancy-Tolerant Versions of the Log-Normal (L/N/P) Model: 
The Log Sculptured-Normal Poisson (L/S-N/P) . 

It is natural to generalize the L/N/P model as follows. Again 



put 



A . 
~ 3 



exp ( e j ) 



( 4 . 1 , a) 



but now 



e. = lnx. = u.+od>(z.) 

- 3 ~3 J -3 



y j +o Zji^(Zj) , 



( 4 . 1 ,b) 



where y ^ = U + 3_ x ^ , z ^ ~ N/(0,1), and ip ( z ^ ) = <j> ( z ^ ) /z ^ is a 
sculpturing function . See Gaver (1983) , and also Hoaglin ( 1983) 
for an account of certain such functions in the normal (Gaussian) 
context, attributable to Tukey, see (1974) . Call <J>(Zj) a 
sculptured normal . The purpose of (4.1,b) is to describe stretch - 
tailed distributions of log-rates, i.e. those that exhibit quite 
widely straggling exotic or extreme values, or outliers, both above 
and below the normal-like central part. Illustrations of well- 
behaved distributions of rates, as contrasted to straggling tailed 
distributions, and even multi-modal distributions, appear in Figs. 
1, 2, and 3. Our methods are aimed at dealing with the effects 
of Fig. 2, and, to an extent. Fig. 3. 

Here are some convenient sculpturings of the normal. 



(A) cj>(z;n) = T ^T ^n-2 [exp ( z^ (— — - 1 ] , a pseudo-t; 

|z| (n-1) 

an explicitly-invertible approximation to a true Student t 
with unit variance if n > 2. See Gaver and Kafadar (1984) . 
This representation is used extensively later in the paper. 
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(B) <j>(z,n) = (/(n-2)/n)t , a true Student t with n deg. fr., 
again with unit variance if n > 2. Parenthetically, use of 
a finite variance t is in no way essential. 

2 

(C) <|>(z;h) = ( l-4h) ^^ze * 12 , 0 £ h < ^ a member of Tukey's h-family; 
see Hoaglin (1983) , and Tukey (1974) . 

There are other interesting and convenient such forms that give 

promise of providing multi-modal parametric representations; see 

Cobb (1983) . All of our above forms yield distributions of 

2 

e = In A that are unimodal and symmetric, have variance o , and 
are more stretch-tailed than the basic unit normal, z. As will 
be seen, this latter qualitative modification has a beneficial 
effect upon the rate estimator, reducing the tendency for indis- 
criminate shrinkage of apparently very discrepant observations. 

4.1 Nearly Explicit Discrepancy-Tolerant (Controlled Shrinkage) 
Estimators of Rates. 

The several sculptured normal representations just presented 
yield directly interpretable rate estimates by way of approximate 
likelihood maximization, as in (3.3) . Results for options (A) , 

(B) , and (C) are sketched. 

(A) : Invert the pseudo-t to find 

exp ( -z 2 /2 ) = (1 +^ 7 ) _(n_1) / 2 ( n " 3 / 2 > 

Now the log-likelihood associated with observation j is 
proportional to 
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(B) : 



(C) : 



£ j Uj ;P j /Q 



■2T^Wln (1+ ^i) - 



+ s . In A ( <j> . ) , 
3 3 



(4.2) 



and is expressed in terms of the pseudo t individualization 
<(>j = <J>(Zj). Differentiation then yields the estimating 
equation for A ^ , or <J> ^ : if = y ^ + a<J> ^ , then 



A . 

3 



e . 
> 3 



(s. - ( 3 ? 3 )w . (n) ) 
3 o 3 



(4.3) 



where 



Wj (n) 



(n-1) 

(n- 3/2) (n-2) 



1 + ( 



e . -y . 2 
3 3 ' 1 






n-2 



(4.4) 



The log-likelihood is very similar to (4.2); one obtains 
only a slightly different weight: 



A . 
3 



e . 
„ 3 



y . -e . , 

(s. - ( 3 ? J ) w. (n) ) — 
3 cr 3 j 



(4.5) 



with 



, \ ,n+l. 

w.(n) = ( — — - k ) 

j n- z 



1 + 2 

a n-2 



(4.6) 



In this case the convenient individualization parameter is 



a normal deviate, z_. , where (Hz J = z j ex P(hz^) 
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A . 
3 



and 



y+o<)) (z . ) 
> -J 



Wj (h) 



{s j - z i w j (h)) tT 

3 



( 4 . 7 ) 



(1 - 4h ) 



-3/4 



, hz 2 
(1 + 2hz ^ ) e 3 



(4.8) 



It can be seen that (4.3), (4.5), and (4.7) all can on occasion 

have two real solutions, corresponding to the possibility of two 
modes in the likelihood, or Bayesian posterior, for <f> ^ . Strict 
adherence to likelihood doctrine would force computation of each 
solution and a check to see which globally maximizes likelihood-- 
a possible but tedious task. The same is true of a Laplace method 
approach, which requires modification to account for the bimodality, 
Consequently it is proposed to simply modify the estimate-dependent 

weights (4.4) and (4.6) to estimated weights that utilize 

/\ 

the raw-data estimate ln(s./t ) in place of <b . , so in each case 

3 3 ^ 3 



Wj (n) 



C ( n) 



In (s ./t . ) - m . 2 . 

1 + ( L r J !) (zA*r) 



(4.9) 



n-2' 



which can be computed once; this modification leads to a unique, 
approximately Bayes, solution with reasonable properties. Of 
course both likelihood (or posterior) bimodality or the occurrence 
of a small weight value may suggest the need for model changes or 
other action. 

Notice that in each case the weight w_. (n) modifies the resulting 
estimate towards discrepancy tolerance: a value of the individualization 
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parameter far from the appropriate superpopulation mean y^ exerts 
a small influence on the shrinkage term, so the quoted estimate 
A. tends to be nearly the raw rate s./t... Interestinqly , all of 
these extended tail models induce weights in excess of unity on 
observations with log-raw-rates close enough to y ; this might be 
called over-shrinkage , and is noticeable in Table 1, p. 79 of Berger 
(1984), wherein a normal likelihood is combined with a Cauchy (one 
d.f. t) prior. This very same effect has been pointed out by Tukey 
in (1974) , p. 132. 

An interpre table approximation to these log rate estimates is 
obtained by the following linearization: in (4.3) or (4.6) start 

A 

with e.(l) = ln(s./tj) and turn the Newton-Raphson crank once, but 

A 

evaluate w^ (n) at e^d), i.e. utilize (4.9). Then 



(s . ) ln(s ./t . )+ (y ./o )w . (n) 



In A.. (1) 



Ejd) = 



J. 



3 _ 



(Sj) + (l/o )Wj(n) 



(4.10) 



The term (s^) is the delta-method estimate of (Var [ In (s ^/t ^ ) ] ) ^ , 

so the estimate quoted is seen to be nearly a linear combination 
of the raw rate estimate and the individualized mean, with the 
shrinkage towards the latter influenced by the discrepancy 

A 

( In ( s j/t j ) -y j ) /a as reflected in the weights w^ (n) ; small dis- 
crepancies tend to shrink the estimate towards the mean, while 
large discrepancies are tolerated, i.e. left largely without 
shrinkage so that the quoted estimate is nearly In A ^ - log(Sj/tj). 
The parameter n or h in the superpopulation models is not estimated 
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at this point, but treated as a tuning parameter: the smaller n 
(n > 2) , or larger h (h < 0.25) the greater the effect of the 
weights upon discrepant observations. In practice, n = 4 has 
given satisfactory performance, as will be suggested by simulations 
and trial data analyses. For a similar analysis procedure in the 
more classical robustness context see the biweights for estimation 
of a distributional center, Mosteller and Tukey (1977), p. 353 

A 

ff . ; our weight w^ (n) is essentially an influence function, with 
degree of observational influence adjusted by choice of n or h , 
corresponding to the parameter c in biweight technology. 
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5 . Superpopulation Parameter Estimation 



In order to provide suitable pooled, shrunken, estimates of an 
individual item rate it is necessary to invoke estimates of super- 
population parameters. Unfortunately, the superpopulation variate 
values are not observed directly, but are contaminated by "Poisson 
noise;" this complicates the task of parameter estimation. 

Natural approaches to the estimation problem are through 
moment matching, maximum likelihood or Bayesian approaches. We 
suggest variations on these themes that require different degrees 
of computer-intensive effort, and are of differing effectiveness. 

The methods advanced for consideration have been compared by simula- 
tions. The limited histories typical in various fields, e.g. 
in reliability and survival studies, and in nuclear plant risk, 
do not encourage faith in the validity of asymptotic error analyses 
without such corroboration. 



5.1 Likelihood Estimation for the Log Sculptured-Normal Poisson 
(L/S-N/P) Model 

Suppose that a time history of length t ^ results in s^ events 
for item j (j = 1,2,...,J). The data is to be analyzed with 
reference to the general L/S-N/P model of (4.1) , but for the 
present y^ = y, a constant; regression will be discussed later. 

Method 1; Likelihood By Gauss-Hermi te Integration. 

2 

The likelihood of the parameter y and o given the data and 
the L/S-N/P model can be expressed as 



L ( y , c ; s , t ) 



a 2 

n L . ( y , a , s . , t . ) 

j=l J J J 



(5.1, a) 
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where 



Lj (U ,o‘ 



W 



CO 




_1 2 

e "2 Z -A ( z) t . 

— e J 

/2tt 



s . 

(A(z)t.) 3 

s . ! 

D 



dz , 



( 5 . 1 ,b) 



and 



log A(z) = y + cu}>(z) = y + ozip(z) (5.1,c) 

Throughout what follows the sculpturing function, <p or equiva- 
lently 4>/ is assumed given. 

The integrals (5.1,b) cannot be carried out analytically. 

-i-z 2 

Owing to the appearance of e ^ , the use of Gauss-Hermite 

numerical integration is suggested. In the notation of Abramowitz 
and Stegun (1968) , 



/2tt Lj ( y ,o ; s j , t j ) 



e X f ^ (x) dx 



: I 



W . f . (x . ) 

1 D 1 



where z = /2x and 



f (x i ) 



-A ( /2x . ) t . _ s . 

e 1 J ( (A (/2x i ) t.) 3 




(5.2) 



the x., w. values are taken from tables. A grid search among 

l i 

2 

y, a values then reveals the approximate maximum likelihood 
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solutions from (5.1). It has been discovered that care is required 



histories. Also, the straightforward integration is numerically 
ill-conditioned . 

Method 2: Quadratic Approximation (Laplace’s Method Approach). 

An appealing approximation to the integral (5.1,b) is obtained 

t h 

by expressing of the j — likelihood component as 



2 

in the choice of the jj , a start when analyzing a few short data 



2 

L j ( 1J , o ;Sj,tj) 




oo 




(5.4) 



— OO 



/2tt 



where, except for irrelevant parameter-free constants. 




A ( z) t j - s j In A ( z ) , 



(5.5) 



which has a qualitatively bowl-shaped appearance. Hence 



quadratically approximate Q p (z) as follows: 





(5.6) 






(5.7) 



so 





(5. 8, a) 
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or 



ln(s./t.) - y 

*<V - — — 



(5. 8,b) 



namely 



, ln(s./t.) - y 

9j = T 1 ( V , . 



( 5. 8 ,c] 



Clearly Q p j ( 0 j ) is independent of y and cj and can be ignored, 
while 



Qpj (0j) = s.o 2 [<t>' (0 j ) ] 2 , 



(5.9) 



and so the approximate likelihood is of the form 



L j ( y , o ) 



- ! -y e -2W (z -v 2 

/ -= « 5 — dz/Q" (0 ) 

> /2tt /2tt ^ ^ 



= exp [-ie^/ (1 + (Q” (0 •) ) 1 )) 

Z J J 



A + (Qpj ( 6 j ) ) 



(5.10) 



-1 



Specific adoption of sculpturing option (A) , the pseudo-t, pro- 
vides that 



e : - 



' . 2 <{) (0 . ) I , 

T^w lo 9 II * (-^1 «(e.) 



(5.11) 



and 
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°PD <0 D ) = s j° 2 I4> ' ( 6j ) ] 2 



_ 2 2 (n-2) ( n-3/2) 1 + ^ (9 j )/(n ~ 2) 2 

J V (^TT 5- *<V 



(5.12) 



each of which is evaluated from 
the normal superpopulation case 
approximation treats log(Sj/t^) 



(5.8,b) . Note that if n ■> « 

is obtained, and the present 

2 

as approximately N(y,o +1/Sj) 



then 



Method 3: Quadratic Approximation to the Log-Likelihood, 

Augmented by Gauss-Hermi te Integration 

The previous method blithely approximates the Poisson log- 
likelihood by a quadratic in order to achieve convenient and 
interpretable results. Consider next a more careful approach that 
combines Methods 1 and 2. To do so, express the entire exponent 
in the integrand of ( 5 . 4 ) --essentially the negative log-likelihood-- 
as 



Qj ( z ) = jz 2 + Q p (z) 



jz 2 + tjA(z) - Sj lnA(z) 



(5.18) 



Let 0 . be a solution of Q! (z) = 0. NOte that this solution 
D 3 

will not be as explicit as before, and so an approximate value 

A 

may sometimes be most conveniently used; call it 0 ^ . Now expand 
up to quadratic terms and let 

Q.(z) = q (z) + Q ( 0 j ) + (z-e^Q^Sj) + \ ( z- 0 . ) 2 Q(’ ( 0 . ) (5.19) 



where 



qAz) = Q (z) -Q (0 ) - (z-e.)Q! (0 -)-y(z-0 .) 2 Q"(0 .) ; 

J J JJ J J J Z J J J 



(5.20) 
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regard q^(z) as the error in the quadratic approximation. Next 
complete the square in the quadratic terms of (5.19) and put 



x 



— [ (z-e .)/q"( 8 ) + Q' (0 .)//Q"(e •) ] 
/2 3 3 3 9 



(5.21) 



and hence 



z (x) 



/2 x 
/Q" ( 0 j ) 



Q ' ( 0 • ) 
QMgj) + 9 j 



(5.22) 



It thus follows that 



[ Q ' ( 6 • ) ] 2 - 

Q (z(x)) = q . ( z (x ) ) i + x z , (5.23) 

J J 2Q" ( 9 .) 



and that the likelihood assumes the form 



2 - [Q!(0.)] 

L (v,o ) = K exp(-Q (0 )+ — i-J ) - 

3 3 3 3 2 QV(0 ) /Q^Oj) 0 



I.(p,o 2 ), (5.24) 



where 



Ij(U,0 ) 



,°° _ x 2 -q • ( z ( x) ) 

/ e e 3 dx , 



(5.25) 



and Kj is a constant independent of parameters p and o . The 
idea is that subtraction of the quadratic approximation from 



Q.(z) should leave a relatively minor correction, I., to be 
3 3 
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evaluated by Gauss-Hermi te integration. Finally, then, the log- 
likelihood is of the form 



£ 



(M/0 ) 



, J - [Q ’ ( 0 . ) ] 2 , 

Y l {-Q(0 • ) + l y inQ" (6-) + In I.} 

J j=l 3 2Q" (0 . ) Z 3 3 



(5.26) 



which can be examined for maxima over y and a > 0. NOte that 

/\ 

if the equation Qj(z) is in fact solved precisely, then Q'(0.) = 0, 

A 2 A 

and the correction term [Q'(0.)] /2Q"(0j) may be omitted. 

/N 

An important part of the computation involves finding 0 ^ , an 
approximate solution to the equation 



Qj (z) 



= 0 



= Z + A'(z)t. - S. K-r 

3 3 X(2) 



Parametrization by the sculptured form In A(z) = y + o0(z) leads 
to the equation 



0 = z + o(tjA(z) - Sj)<J>'(z) 



(5. 27, a) 



or 



0 = z + o (tje M+0<J> (z) - Sj)tf>’(z) ( 5 . 27 , b) 

At this point it becomes desirable to introduce a specific, 
tractable, parametric form for <p(z) . Consult (4.3) : option (A) , 

a pseudo-t, is handy, and will be adopted. Here are some details. 
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Option (A) (pseudo-t) . 



For this representation, 



Q'(z) = aSz ( 1 + £ /a ) 



(5.28) 



where a = n-2, 3 = (n-3/2) / (n-1) from Gaver and Kafadar (1984) 
requires solving the following equation for <J> : 



y+o<}) _ 



= (s . - 



(n-1) *<J>/o 



j (n-2) (n-3/2)(l + <J> 2 /(n-2)) fc j 



(5.29 



this has arisen earlier as (4.3) in the context of finding an 

individualized estimate of A . . 

3 

To simplify calculations, one may initially take as an 
approximate solution 



♦j = 



ln(s_./t_.) - y 



(5.30) 



Reference to (4.3) then shows that 



0. = (\ / ^ In ( 1 + ^— ) ) sign (4> . ) 

3 V p a 3 



(5.31) 



Furthermore , 



for this determination of 0 ^ , 



~ 1 ~ 2 

0(0,) = y(0,) 

Q'(0j) = 0 j 



(5.32) 



(5.33) 
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(5.34) 



l+(4> .) /a 



Q" ( 0 • ) = l+os.(ctB0.[ 1 ]) , 



♦j 



and now (5.26) can be evaluated for any p , a . The resulting 

J 2 

approximate log likelihood, £ log L.()j,o ) , with L. as in (5.24) 

j = l 3 3 

and above can be explored for maxima by numerical search. Alterna- 



tively, numerical integration of exp(-£(y,o )), see (5.26), with suitab 

2 



( non-informative) priors for p and o results in Bayes estimates; 
this ambitious numerical step has not yet been carried out. 

A more detailed investigation may begin by precise determination 
of solutions to (5.29). Graphical analysis quickly reveals the 



possibility of three real solutions, two of which identify local 

2 



maxima. It can be seen that, given p and o , the solution in 
terms of e = (<{>-p)/a of the equation (4.5) modified with the 



weight (4.9) is a reasonable choice for 4> ^ , which is then converted 



to 6 j by (4.3). Now 



Q ' ( 0 j ) = e j + (A < 0 j ) t -Sj)<f>' ( O j ) , 



(5.35) 



and 



Q" (0 j ) 



= 1 + 



o 2 t. 



(<P 



( 0 j ) ) 



o ( X ( 0 j ) t j 



- s.) ■• ( e j ) 



(5.36) 



where 



<J>'(0.) = aB — 1 — (1 + 4> 2 ( 0 • ) /a) 

D <M0.) 3 



(5.37) 



and 
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<f>" (0 •) = — ( ag ( 1 + 4> 2 (0.)/a) (1 + a80^) - (4>* (9-) ) 2 } . (5.38) 

3 <J) ( 0 j ) 3 3 3 



The Q . -derivatives are 
3 

expression is searched 



introduced into (5.24), and the resulting 

2 

for maximizing values of p and o . 
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6 . Simulation Testing 



The procedures described for carrhing out the estimation of 

2 

superpopulation center (y) and variance (o ), and the individualized, 

or selectively shrunken, estimates of rates, e.g. (4.3), have 

been appraised by simulations. In summary, the procedures 

described deliver satisfactory results; empirical distributions of 

2 

estimates of y and a appear to center close to the values input, 
and the average distance of selectively shrunken individualized 
rate estimates from the true (distance being mean squared error, 
median absolute deviation) often improve upon obvious competitors. 

Of course these statements apply only to the range of parameter 
values studied, which illustrate those encountered in certain 
nuclear power system probabilistic risk assessments. A brief 
selection of many simulations appears in the following tables and 
figures . 



6.1 Simulation Design. 

A simulation requires specification of the superpopulation form 



and parameters, the sample size, the exposure times t_. (j = 1,2,...,J) 



and the sculpturing function <}>(•) used in rate production. Note 
that the latter function need not — and here will not — be the same 
as that used to construct individualized rate estimates. 

Specification of the present simulation follows. 

(a) Superpopulation form is a sculptured normal form (C) , the 
Tukey h family, from which actual or "true" rates are easily con- 



structed ; 



. th 






^ j j = exp [y + o<f>(z ( ^ ) J / where z ^ j ^ being the j — ordered o 



increasing magnitude) unit normal, where 0(z^_.^) = z ^ ^ j exp (hz ^ j ) . 
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Simulation of a sample of J begins by obtaining J unit normal 
deviates, ordering them , and computing A^, j = 1,2,...,J. 

(b) Having ''■(j)' and having specified t ^ , J realizations of 

independent Poisson random variables, or counts, are generated; 

s . corresponds to mean A , . , t . . 

3 ( D ) 3 

(c) The simulated observations (s^t^; j = 1,2,...,J) are analyzed 

according to the L/S-N/P model by Method 3 to obtain point esti- 

2 

mates of superpopulation parameters y and a . The pseudo-t 
option (A) is used in the likelihood; parameter n has been treated 
as a tuner, either specified as low, e.g. n = 4, yielding highly 
restricted or selective shrinkage, or as high, e.g. n = 50, corres- 
ponding nearly to the more conventional log normal model. A 

2 

numerical search procedure has been utilized to locate y and o 
values . 



(d) Individualized estimates A^ 

2 

using estimated y and o from (c) : 



are computed by these options. 



MLE ; 
SSP : 
RSP : 



A 




Sj/tj (6.1) 

solution of (4.3) using w^.(n) = 1 (6.2) 

solution of (4.3) using the (n) of (4.9); 

(6.3) 



The abbreviations are, respectively, for maximum likelihood, 
simple shrinkage Poisson, and restricted shrinkage Poisson. 

2 

(e) Each case (J,y,o , h ,( t^ ), n) combination is independently 
simulated 200 times. 
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(e) Each case, i.e. (J,y,a ,h,tj,n) combination, is independently simu- 
lated 200 times for Table 6.1, and 100 times for Table 6.2. The 
mean-squared errors of estimators (6.1), (6.2), (6.3) are sum- 

marized — summary figures (sample mean, median, standard deviation, 
median absolute deviation, and mean squared error) for errors of 
estimate of the smallest true rate , the median true rate 

A 

A J+1 , and the largest, A ^ jj : A^ -A^j. Choice of these three 

values for true A is enough to reveal the different behaviors of 
the candidate estimators : in general SSP and RSP both greatly 

improve upon simple MLE for centrist (median) A values by borrowing 
strength, while RSP improves upon SSP at true A extremes by refusing 
to overshrink. 

Tables 6.1 and 6.2 summarize illustrative sets of simulation re- 
sults. Note that the estimates of the superpopulation mean, p , appear 

2 

close to being unbiased, while those of the variance o appear to be 
consistently biased downwards in Table 6.1 (J = 15 ), and about 
right in Table 6.2 (J = 45) . Standard error of estimate (square- 
roots of the variances of the empirical distributions of the correspond- 
ing parameters) are, not surprisingly, substantial; as is sensible, 

i 

they decrease as J increases. Nevertheless, comparison of the simu- 
lated MSE figures for the various estimators suggest that RSP, 
especially for n = 4, has advantages: for the smallest rates, A^, 

and the largest, A^^, RSP ' s mse more nearly resembles the MLE 
mse performance than does the more heavily shrunken SSP, particularly 
for n = 50 and 75 which imitate the action of a log-normal analysis; 

for middle values, A, oa and A /oow both RSP and SSP estimates 

( o ) -i ) 

shrink moderately, all improving substantially upon the MLE. 
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Table 6.1 



Selected Mean Squared Error Comparisons 
and Estimated Superpopulation Parameters 





J = 15, h = 0.15 


, 200 Simulations 












Mean Squared Error 




True 

Values 


Estimated 


Estimator 


A ^ (small) 


A ^ (median) 


A (15) {lar 9 e) 




(n = 4) : y = -0.97(0.41) 


PSP 


0.016 


0.019 


0.33 


y = -1.0 


o 2 = 0.17(0.15) 


SSP 


0.020 


0.020 


0.34 


o 2 = 0.25 




MLE 


0.007 


0.030 


0.32 




A 

(n = 50) : y = -0.98(0.45) 


RSP 


0.019 


0.020 


0.35 




o 2 = 0.18(0.15) 


SSP 


0.019 


0.020 


0.35 





(n = 4) : 


/s 

y = -1.93(0.50) 


RSP 


0.0050 


0.0060 


0.28 






a 2 = 0.18(0.17) 


SSP 


0.0060 


0.0058 


0.30 


y = -2.0 














a 2 = 0.25 






MLE 


0.0026 


0.014 


0.27 




(n = 50) : 


A 

y = -1.93(0.52) 


RSP 


0.0053 


0.0057 


0.30 






o 2 = 0.20(0.18) 


SSP 


0.0054 


0.0057 


0.30 
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Table 6.2 



Selected Mean Squared Error Comparisons 
and Estimated Superpopulation Parameters 





J = 45, h = 


0.10, t. 

3 


= 5; 100 


Simulations 




True 

Values 


Estimated 


Estimator 


A^ (small) 


A ^ 3 ) (median) 


A ( 45 ) (large) 




(n = 4) : y =0.50(0.25) 


RSP 


0.030 


0.067 


2.65 




o 2 =0.41(0.29) 


SSP 


0.050 


0.067 


2.75 


p =0.50 
o 2 =0.35 




MLE 


0.011 


0.13 


2.61 




(n=75) :p =-0.56(0.30) 


RSP 


0.042 


0.069 


2.68 




a 2 =0.44(0.28) 


SSP 


0.044 


0.069 


2.71 
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7. 



Analysis of Data 



Our methodology will now be applied to several sets of observa- 
tional failure or event rate data. In each case estimated super- 
population parameters p and are quoted, as are the unpooled 
maximum likelihood individual rate estimates (MLE) , the simple 
linearized shrunken (Bayes) estimates (SSP) , and the discrepancy- 
tolerant restricted shrinkage estimates (RSP) , along with the 
weights associated with each of the latter. It appears that the 
results so obtained contrast interestingly, with the RSP behaving 
in the discrepancy- tolerant manner anticipated, and with small weights 
influencing this behavior, especially in data sets for which J is 
substantial . 

7.1 Ship System Failure Rates 

The numbers of failures during one year experienced by each of 
J = 254 individual systems aboard a Navy ship have been furnished 
by Dr. R. Coile. It is provisionally assumed that all systems are 
exposed to failure throughout time, and that the failure process is 
nearly Poisson; neither assumption can be checked, but the analysis 
is of interest. The analytical results are in Table 7.1. 

The heavy preponderance of zero and one-failure systems is 
recognized by both SSP and RSP, which nearly agree: both shrink 

in the same direction at this level. However, SSP continues to 
shrink towards low rate values even for the two units with relatively 
high observed rates, while RSP is far more discrepancy- tolerant , as 
dictated by the corresponding low weights. It is interesting that 

our procedures estimate, on the basis of a log-normal superpopulation, 

A ]_~2 

an expected failure rate of exp ( p + jo ) = 0.54, close to the pooled 
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TABLE 7.1 



SHIP SYSTEM FAILURE RATES 





A 

M = -1. 


~2 

34, a = 


1.46, J = 


254 




Number of 
Units 


Failures 


MLE 


SSP 

(n =75) 


RSP 
(n =4) 


WT. 


178 


0 


0.00 


0.20 


0.22 


1.8 


48 


1 


1.00 


0.52 


0.50 


1.1 


16 


2 


2.00 


1.0 


1.2 


0 . 75 


3 


3 


3.00 


1.7 


2.2 


0 . 59 


6 


4 


4.00 


2 . 5 


3.1 


0 . 51 


1 


5 


5.00 


3.3 


4.2 


0.45 


1 


9 


9.00 


6.8 


8.2 


0 . 34 


1 


11 


11.00 


8.6 


10.2 


0.31 
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observed rate of 0.50. The median failure rate is exp(p) = 0.26, 
and since over one-half of the observations are zeros, it is also 
encouraging that the estimated median rate is near the estimated 
rate for systems with zero failures. Finally, a calculation of 
the probability of zero events for a gamma superpopulation moment- 
matched to the log-normal with the observed parameters yields 
0.75, which may be compared to the observed fraction 178/254 = 0.70. 
Although the simple calculation is perhaps crude, the agreement 
is gratifying. 

7.2 Loss of Feedwater Flow 

Table 7.2 presents a set of data referring to the rates of 
loss of feedwater flow for a collection of nuclear power genera- 
tion systems; see Kaplan (1983). The corresponding SSP, RSP 
derived rates are included. Once again, the units with very small 
(< 0.5) weights, namely Systems 1, 3, 7, 18, and 19, all display marked 
differences between RSP and SSP, with the resulting SSP estimates exhibit- 
ing shrinkage upwards far more extensive than those of the corresponding RSP. 
Systems 11 and 23 have the highest observed rates, both have about 
the same times of exposures and nearly the same computed weights, 
and both RSP estimates are slightly less shrunken, thus closer to 
the MLE , than are those from SSP. The estimated median rate, 
calculated on the basis of a log-normal superpopulation, is 
exp(y) = 2.56; the fraction of MLE rates equal to or exceeding 
2.6 is 16/30 = 0.53; the corresponding fractions of SSP and RSP 
rates is 15/30 = 0.5, in good agreement. 
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19 
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25 

26 

27 

28 
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Table 7.2 



Loss of Feedwater Flow Rates 

/V A ^ A A A A 

1 J = 0.94 <j = 0.31; A= exp(p + o /2) = 2.99 



s ( j ) 


t ( j ) 


MLE 


SSP (n =75) 


RSP (n = 4) 


WT. 


4 


15 


0.27 


0.59 


0.35 


0.19 


40 


12 


3.3 


3.3 


3.2 


1.6 


0 


8 


0.041 


0.59 


0.087 


0.063 


10 


8 


1.3 


1.5 


1.5 


0.98 


14 


6 


2.3 


2.4 


2.4 


1.8 


31 


5 


6.2 


5.7 


5.8 


0.30 


2 


5 


0.4 


1.0 


0.64 


0.27 


4 


4 


1.0 


1.5 


1.4 


0.74 


13 


4 


3.3 


3.1 


3.0 


1.7 


4 


3 


1.3 


1.7 


1.8 


1.1 


27 


4 


6.8 


6.1 


6.2 


0.71 


14 


4 


3.5 


3.3 


3.2 


1.6 


10 


4 


2.5 


2.5 


2.5 


1.8 


7 


2 


3.5 


3.2 


3.1 


1.6 


4 


3 


1.3 


1.7 


1.8 


1.1 


3 


3 


1.0 


1.5 


1.5 


0.74 


11 


2 


5.5 


4.6 


4 . 6 


0.92 


1 


2 


0.5 


1.4 


1.0 


0.34 


0 


2 


0.17 


1.2 


0.41 


0.14 


3 


1 


3.0 


2.8 


2.7 


1.7 


5 


1 


5.0 


3.8 


3.7 


1.0 


6 


1 


6.0 


4.3 


4.5 


0.83 


35 


5 


7 . 0 


6.4 


6.6 


0.68 


12 


3 


4.0 


3.6 


3.5 


1.4 


1 


1 


1.0 


1.9 


1.8 


0.74 


10 


3 


3.3 


3.1 


3.0 


1.6 


5 


2 


2.5 


2.5 


2.5 


1.8 


16 


4 


4.0 


3.7 


3.6 


1.4 


14 


3 


4.7 


4.1 


4.1 


1.1 


58 


11 


5.3 


5.1 


5.1 


0.98 
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7.3 Globe Valve Leak Failures. 



These data pertain to nuclear power plant globe valve leak- 
mode failures, categorized according to operator type, the source 
being NPRDS data; see Hill, et_ al^ (1984), p. 9, where the results 
of a gamma superpopulation analysis are presented and discussed. 

The data are categorized by operator type, and the between-category 
variability is described by a superpopulation. A more appropriate 
analysis would presumably be (known) category by category, with 
between-system variability within each known category described 
by a superpopulation. Table 7.3 describes the data and estimates, 
but also includes the gamma estimates of Hill et al . 

Table 7.3 

Globe Valve Leak Failure Rates 
y(4) = 0.040, o 2 (4) = 1.1, p (7 5) = 0.18, 0^(75) = 1.08 

Gamma SSP RSP 



Category 


s ( j ) 


t ( j ) 


MLE 


PEB 


(n = 75) 


(n = 4) 


WT . 


1 


31 


236.9 


0.131 


0.134 


0 .138 


0.136 


0 .60 


2 


157 


115.9 


1 .35 


1 .35 


1 .35 


1 .35 


1.7 


3 


30 


36.8 


0.815 


0.823 


0.816 


0.825 


1 . C 


4 


13 


7 . 60 


1.71 


1.67 


1 . 63 


1 . 62 


1 . 6 


5 


7 


5.47 


1.28 


1.27 


1 .22 


1.23 


1 . 8 


6 


7 


1 . 69 


4 . 14 


3 .35 


3 . 38 


3 . 51 


0.96 


7 


0 


1 . 12 


0.00 


0 .411 


0 .47 


0 . 55 
(0.50) 


1 . G 
(0.83) 


8 


0 


0.55 


0 . 00 


0.559 


0.59 


0 .78 
(0.65) 


1 . 6 
(0.83) 
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The gamma-PEB and the present SSP-RSP methodologies give 
rather comparable results for the first six categories. The 
last two categories differ more strikingly, with the SSP-RSP 
procedure shrinking somewhat more extensively than the gamma, 
the RSP weights, especially that for category 8, are surprisingly 
high; this is believed to be the result of the necessity of 
approximating the assessment of discrepancy on the log-scale for 
s 7 = Sg = 0. If the experiences for categories 7 and 8 are pooled 
in order to compute weights, then the perhaps more acceptable 
numbers in parentheses result. Although the weights placed on 
the apparently discrepant rates for categories 1, 7, and 8 are 
not as striking as might be wished, they are of interest. It 
must be recognized that J = 8 is a very small group or "sample." 
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